GRAFIX: Automated Rule-Based Post Editing System to Improve English-Persian SMT Output

Authors

  • Mahsa Mohaghegh
  • Abdolhossein Sarrafzadeh
  • Mahdi Mohammadi
Abstract

This paper describes the latest developments in the PeEn-SMT system, specifically experiments with Grafix, an automatic post-editing (APE) component developed for PeEn-SMT. The success of well-designed SMT systems has made SMT one of the most popular MT approaches. However, MT output is often seriously ungrammatical, and this is more prevalent in SMT because the approach is not language-specific. PeEn-SMT works with Persian, a morphologically rich language, so post-editing the output is an important step in maintaining translation fluency. Grafix performs a range of corrections on sentences, from lexical transformation to complex syntactic rearrangement. It analyzes the target sentence (the SMT output in Persian) and attempts to correct it by applying a number of rules that enforce consistency with Persian grammar. We show that the proposed system improves the quality of state-of-the-art English-Persian SMT systems, yielding promising results in both automatic and manual evaluation.
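As a rough illustration of the idea described in the abstract (rules applied in sequence to the SMT output, ranging from lexical transformation to syntactic rearrangement), the following is a minimal sketch and not Grafix's actual implementation. The rule names, the lookup table, the POS tag set, and the English placeholder tokens are all hypothetical; the only linguistic fact assumed is that Persian is predominantly verb-final.

```python
from typing import Callable, Dict, List, Tuple

# A token is (surface form, POS tag). The tag set is hypothetical.
Token = Tuple[str, str]
Rule = Callable[[List[Token]], List[Token]]

# Hypothetical lookup table for a lexical transformation rule.
LEXICON: Dict[str, str] = {"teh": "the"}


def lexical_rule(tokens: List[Token]) -> List[Token]:
    """Replace surface forms found in the lookup table; keep tags unchanged."""
    return [(LEXICON.get(word, word), tag) for word, tag in tokens]


def verb_final_rule(tokens: List[Token]) -> List[Token]:
    """Move a single main verb to the end of the sentence,
    before any sentence-final punctuation."""
    verbs = [tok for tok in tokens if tok[1] == "VERB"]
    if len(verbs) != 1:
        return tokens  # ambiguous or no verb: leave the sentence untouched
    rest = [tok for tok in tokens if tok[1] != "VERB"]
    if rest and rest[-1][1] == "PUNCT":
        return rest[:-1] + verbs + rest[-1:]
    return rest + verbs


def post_edit(tokens: List[Token], rules: List[Rule]) -> List[Token]:
    """Apply each rule in order; each rule sees the previous rule's output."""
    for rule in rules:
        tokens = rule(tokens)
    return tokens


if __name__ == "__main__":
    # English placeholders stand in for tagged Persian SMT output.
    sentence = [("Ali", "NOUN"), ("read", "VERB"), ("teh", "DET"),
                ("book", "NOUN"), (".", "PUNCT")]
    edited = post_edit(sentence, [lexical_rule, verb_final_rule])
    print([word for word, _ in edited])  # ['Ali', 'the', 'book', 'read', '.']
```

Keeping each rule as an independent function over a token list makes the order of application explicit and lets individual rules be added, removed, or tested in isolation.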

Similar resources

A Three-Layer Architecture for Automatic Post-Editing System Using Rule-Based Paradigm

This paper proposes a post-editing model in which our three-level rule-based automatic post-editing engine called Grafix is presented to refine the output of machine translation systems. The type of corrections on sentences varies from lexical transformation to complex syntactical rearrangement. The experimental results both in manual and automatic evaluations show that the proposed system is a...

A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...

Statistical Post-Editing for a Statistical MT System

Statistical post-editing (SPE) techniques have been successfully applied to the output of Rule-Based MT (RBMT) systems. In this paper we investigate the impact of SPE on a standard Phrase-Based Statistical Machine Translation (PB-SMT) system, using PB-SMT both for the first-stage MT and the second-stage SPE system. Our results show that, while a naive approach to using SPE in a PB-SMT pipeline ...

USAAR: An Operation Sequential Model for Automatic Statistical Post-Editing

This paper presents an automatic post-editing (APE) method to improve the translation quality produced by an English–German (EN–DE) statistical machine translation (SMT) system. Our system is based on Operation Sequential Model (OSM) combined with a phrase-based statistical MT (PB-SMT) system. The system is trained on monolingual settings between MT outputs (TLMT) produced by a black-box MT syste...

ProphetMT: A Tree-based SMT-driven Controlled Language Authoring/Post-Editing Tools

This paper presents ProphetMT, a tree-based SMT-driven Controlled Language (CL) authoring and post-editing tool. ProphetMT employs the source-side rules in a translation model and provides them as auto-suggestions to users. Accordingly, one might say that users are writing in a ‘Controlled Language’ that is ‘understood’ by the computer. ProphetMT also allows users to easily attach structural in...

Publication date: 2012